Income growth, employment structure transition and the rise of modern markets: The impact of urbanization on residents’ consumption of dairy products in China

In modern society, dairy products have become increasingly important in the diet as urbanization changes consumption patterns. However, Chinese residents’ dairy consumption remains at a relatively low level, with great potential for growth. Exploring the main determinants of dairy consumption and their effect mechanisms not only helps to improve residents’ health status, but also has important policy implications for the development of China’s dairy industry. Based on data from the China Health and Nutrition Survey (CHNS) from 1989 to 2011, this study empirically analyzes the impact of urbanization on residents’ dairy consumption. The results indicate that urbanization significantly promotes residents’ consumption of dairy products, and that the effect is higher in areas with low urbanization levels and in eastern regions than in areas with high urbanization levels and in midwestern regions. From the perspective of effect mechanism, income growth, employment structure transition and the rise of modern markets are three important mediating paths. Additionally, the results imply that in areas with low urbanization levels, income growth and the rise of modern markets are the main significant mediators, while in areas with high urbanization levels, employment structure transition is a significant mediator. Moreover, income growth is a significant mediator in midwestern regions, and employment structure transition is a significant mediator in all regions. These findings have practical implications for understanding the relationship between urbanization and residents’ food consumption and for further promoting residents’ dairy consumption and the development of China’s dairy industry.


Introduction
Since the reform and opening up, China's urbanization rate has risen continuously, from 17.92% in 1978 to 51.27% in 2011 and more than 64% in 2021. During urbanization, residents' food consumption patterns have changed significantly, mainly through a gradual decline in the consumption of cereal grains and a continuous increase in the consumption of animal products [1,2]. As nutrient-rich animal-derived foods, dairy products have become an important part of residents' food consumption. However, the dairy consumption of Chinese residents remains at a relatively low level. Residents' dairy consumption expenditure accounts for only a small percentage of total food expenditure, and per capita consumption of dairy products is only one-third of the world average, which indicates great potential for further growth. Therefore, exploring the main determinants of dairy consumption and their effect mechanisms not only helps to improve the health status of residents, but also has important policy implications for the development of China's dairy industry.
Existing empirical research on residents' dairy consumption mainly focuses on residents' consumption characteristics and preferences for dairy products [3-7], the factors affecting dairy consumption [8-14], prospects for the future consumption of dairy products [15] and the impact of the 2008 milk scandal on residents' dairy consumption behavior [16,17]. However, few studies directly test the relationship between urbanization and residents' dairy consumption; doing so is one main objective of this study. In addition, most existing studies of residents' dairy consumption consider only urban residents [4,10,12,14,16], but urbanization is a dynamic process rather than a simple binary transition; accordingly, this study accounts for rural residents as well.
The main aim of this study is to analyze the relationship between urbanization and residents' consumption of dairy products in China using data from the 1989-2011 CHNS surveys. We first calculate residents' average daily dairy consumption, the urbanization rate, and the other key variables for each community. Subsequently, we take both urban and rural residents as research samples to explore the impact of urbanization on residents' dairy consumption. Next, we use a parallel multiple mediator model to analyze the impact paths and mechanisms of urbanization on residents' dairy consumption, and find that income growth, employment structure transition and the rise of modern markets are three significant mediators. Additionally, we divide the samples into community groups with different urbanization levels and groups from different regions to explore the heterogeneity. Last, we use the percentile bootstrap method to check the robustness of the models and the instrumental variable method to address potential endogeneity problems.
The remainder of the paper is organized as follows: Section 2 reviews the relevant literature, introduces the study area and proposes the research hypotheses. Section 3 introduces the data and models used in this study. Section 4 conducts the empirical analysis, including the regression of the baseline models, heterogeneity analysis, robustness test and treatment of potential endogeneity. Section 5 provides the conclusions, discussion and directions for future work.

Literature review and research hypotheses
According to extant literature, urbanization mainly affects residents' dairy consumption through three paths: income growth, employment structure transition, and the rise of modern markets.

Income growth
In most studies on dairy consumption, income is found to be an important factor [3,7,10-15,18]. During the urbanization process, a large number of rural laborers migrate to urban areas. Urban residents can obtain more income through the division of labor than they could in agricultural sectors, and this changes their food consumption habits to some extent, leading to a decrease in the demand for staple food [1,25] and a potential increase in the demand for more nutritious foods such as dairy products.
Accordingly, an increase in urbanization level reduces the participation rate in agricultural sectors, and the reduction of manual labor in agricultural sectors, in turn, boosts residents' demand for dairy products. Therefore, we consider employment structure transition as an important mediating path.

The rise of retail terminals represented by supermarkets
The process of population gathering in cities also promotes the continuous concentration of retail terminals in cities. Convenient consumption channels will promote the purchase and consumption of dairy products in cities with high urbanization and development level.
In 2011, the total number of chain retail stores in China was 195.8 thousand, approximately tenfold that of the late 1990s, and the 1990s and the beginning of the 21st century were periods of rapid urbanization in China (see Fig 1). Areas with high urbanization levels have larger and more modern supermarkets [26]. Researchers find that supermarkets and modern retail stores have a significant positive impact on residents' purchases of dairy products [3,10]. Supermarkets can capitalize on large consumer inflows, rich product categories, and cold chain guarantees when competing with traditional retail terminals [27]. Some dairy varieties, such as pasteurized milk, have relatively high transaction costs (including the cost of frequent purchases and storage costs), and the increasing level of market convenience makes the consumption of these dairy products possible [28]. Moreover, the rise of modern markets has helped dairy products from western China enter the eastern market and has broken the monopoly of local dairy processing companies.
Accordingly, the process of urbanization promotes the rise of modern markets, and the development of the retail network further promotes residents' dairy consumption. Therefore, we consider the rise of modern markets as an important mediating path. Fig 1 shows the urbanization process, residents' dairy consumption and the indicators reflecting the three potential impact paths described above, from the 1990s to the beginning of the 21st century. Evidently, residents' income level and the development of modern chain retail enterprises share a common trend with the urbanization process and the rise in residents' dairy consumption, while employment in agricultural sectors shows the opposite trend.
Based on the above analysis, this study proposes the following hypotheses:
H1: Income growth plays a mediating role in the impact of urbanization on residents' dairy consumption. Specifically, the improvement of urbanization promotes residents' income levels, and income growth further promotes residents' consumption of dairy products.
H2: Employment structure transition plays a mediating role in the impact of urbanization on residents' dairy consumption. Specifically, the improvement of urbanization reduces the participation rate in agricultural sectors, and the transition of employment structure boosts residents' demand for dairy products.
H3: The rise of modern markets plays a mediating role in the impact of urbanization on residents' dairy consumption. Specifically, the improvement of urbanization promotes the rise of modern markets, and the rise of modern markets further promotes residents' dairy consumption.
The relationships and influence paths between the variables in the hypotheses are depicted in Fig 2.

Data
The data used in this study are obtained from the China Health and Nutrition Survey (CHNS). The samples in the CHNS comprise populations from provinces in the east, northeast, central and west regions of China, which vary substantially in geography, economic development, public resources, and health indicators; thus, the sample is nationally representative. We use all survey waves from 1989, 1991, 1993, 1997, 2000, 2004, 2006, 2009 and 2011. The 2015 wave is omitted because the data from the diet survey, which is the source of our dependent variable, have not been released to the public. All variables in this study are obtained from community-level data or have been aggregated to the community level.

Variables
Dependent variable. The dependent variable is residents' average daily consumption of dairy products. The CHNS nutrition surveys include the household food survey and the individual diet survey. Both surveys are conducted at the same time, on three randomly chosen consecutive days of a week. The household food survey adopts the food inventory method, that is, it records in detail the daily amount purchased and discarded for each food item and the number of people dining in the household. The daily consumption of each food item per person is calculated from the changes in food inventory. By contrast, the individual diet survey adopts the 24-hour Dietary Recall. This method requires the respondent to recall the food type, quantity, eating time, dining place, and preparation method for the last 24 hours. Compared with the individual diet survey, the household food survey might be biased if guests eat at the respondent's home during the survey period; furthermore, the household food survey does not cover dining out. However, with economic development and the increase in residents' income, dining out has become increasingly frequent among Chinese residents. Therefore, to obtain more precise consumption data, we use the dataset from the individual diet survey based on the 24-hour Dietary Recall.
To identify the dairy products in the CHNS individual diet survey, we refer to the book

PLOS ONE
Key independent variable. The key independent variable in this study is the urbanization level. For the purposes of this study, it is calculated as the proportion of permanent residents with urban hukou in the total community population. Individuals who have been abroad, have moved out to other cities, or have lived in the local community for less than six months each year are not considered in the calculation.
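As a minimal sketch of this calculation, the snippet below computes the urbanization level of one community from person-level records. The field names (`urban_hukou`, `abroad`, `moved_out`, `months_in_community`) are hypothetical and do not correspond to actual CHNS variable names; treating exactly six months as eligible is also an assumption.

```python
def urbanization_level(residents):
    """Share of eligible permanent residents who hold an urban hukou.

    Each resident is a dict with hypothetical keys (not CHNS variable names):
    urban_hukou (bool), abroad (bool), moved_out (bool),
    months_in_community (months lived in the local community per year).
    """
    # Exclude people who are abroad, have moved out, or live locally
    # for less than six months each year.
    eligible = [
        r for r in residents
        if not r["abroad"]
        and not r["moved_out"]
        and r["months_in_community"] >= 6
    ]
    if not eligible:
        return None  # no eligible residents, so the share is undefined
    urban = sum(1 for r in eligible if r["urban_hukou"])
    return urban / len(eligible)
```

Applied to every community in every survey wave, a function of this kind would yield one value between 0 and 1 per community-year.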
Mediating variables. Based on the literature review, this study selects three mediating variables to explore the ways in which urbanization affects residents' dairy consumption: income level, employment structure and the rise of modern markets. Income level is the per capita annual net income at the community level in logarithmic form. Employment structure is defined as the proportion of people engaged in the agricultural sector among the total labor force. To evaluate the development of modern markets in each community, we employ the market component scores provided by the CHNS, which consider the types of markets available, the distance to the markets in or near the local community, and the number of days these markets are open [29]. The higher the score a community receives, the better developed its modern markets.
Control variables. This study selects the purchasing power, production value of cow products, proportion of the elderly and children among the total population, education level, and 2008 milk scandal as the control variables.
Purchasing power significantly affects residents' consumption expectations for food [30]. The decline in purchasing power has a negative impact on residents' dairy consumption. In this study, purchasing power is indicated by the Consumer Price Index.
It is difficult to obtain the dairy output data for each community from the CHNS survey. Instead, we consider the total value of the products produced from the third livestock type (i.e., cows and horses).
An increase in the number of elderly people and young children might enhance the demand for dairy products. In this study, the proportion of the elderly and children is the proportion of residents in the community who are older than 65 years or younger than 6 years.
The improvement of education level will improve residents' cognition and consumption preferences for dairy products. In this study, we calculate the residents' average years of education for each community.
Food safety incidents, such as the melamine incident that occurred in 2008, have a direct inhibitory effect on the consumption of dairy products. Therefore, a dummy variable "milk scandal" is added to the control, for which the years before and after 2008 are coded as 0 and 1, respectively. Table 1 provides the definitions of the variables.
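The community-level definitions above can be sketched as a small aggregation step. This is an illustration only: the record keys (`net_income`, `in_labor_force`, `in_agriculture`, `age`, `education_years`) are hypothetical, per capita income is assumed to be a simple mean over community members, and the mean is assumed to be positive so that its logarithm exists.

```python
import math

def community_variables(members, survey_year):
    """Aggregate person-level records (hypothetical keys) to community level."""
    n = len(members)
    labor_force = [m for m in members if m["in_labor_force"]]
    mean_income = sum(m["net_income"] for m in members) / n
    return {
        # Income level: log of per capita annual net income.
        "income_level": math.log(mean_income),
        # Employment structure: agricultural share of the labor force.
        "employment_structure":
            sum(1 for m in labor_force if m["in_agriculture"]) / len(labor_force),
        # Share of residents older than 65 or younger than 6.
        "elderly_child_share":
            sum(1 for m in members if m["age"] > 65 or m["age"] < 6) / n,
        # Average years of education in the community.
        "education_years": sum(m["education_years"] for m in members) / n,
        # Milk scandal dummy: 0 for waves before 2008, 1 for waves after.
        "milk_scandal": 1 if survey_year > 2008 else 0,
    }
```

Because no survey wave falls in 2008 itself, the dummy only has to separate the 2006 and earlier waves from the 2009 and 2011 waves.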

Models
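First, we construct the following model:
\[ Q_{it} = \alpha_0 + \alpha_1 UR_{it} + \sum_{k} \theta_k Contr_{kit} + Year_t + \varepsilon_{it} \quad (1) \]
where $Q_{it}$ is the average amount of dairy products consumed per person per day in community $i$ in year $t$, $UR_{it}$ is the proportion of permanent residents with urban hukou in the total population of community $i$ in year $t$, $Contr_{kit}$ are the $k$ control variables with coefficients $\theta_k$, $Year_t$ is the time effect, and $\varepsilon_{it}$ is the random disturbance term.
Additionally, the following models on mediators are used to construct the whole mediating effect model:
\[ M_{mit} = \beta_{m0} + \beta_{m1} UR_{it} + \sum_{k} \phi_{mk} Contr_{kit} + Year_t + \mu_{mit} \quad (2) \]
\[ Q_{it} = \gamma_{m0} + \gamma_{m1} UR_{it} + \gamma_{m2} M_{mit} + \sum_{k} \psi_{mk} Contr_{kit} + Year_t + \nu_{mit} \quad (3) \]
where $M_{mit}$ are the mediating variables ($m$ = 1: income level, 2: employment structure, and 3: the rise of modern markets), $Contr_{kit}$ are the same control variables as in Model 1, and $\mu_{mit}$ and $\nu_{mit}$ are random disturbance terms.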
The procedure for testing the mediating effect is as follows [31,32]. First, the direct effect model of the impact of urbanization level and control variables on residents' dairy consumption is regressed (Model 1). If coefficient $\alpha_1$ is significant, there is a mediating effect; otherwise, there is a suppressing effect, because different mediators may have opposite effects. Second, each of the three mediating variables (income level, employment structure and the rise of modern markets) is regressed on the key independent variable, urbanization level (Model 2). Third, the mediating variables, key independent variable and control variables are entered into the full regression (Model 3). If both $\beta_{m1}$ and $\gamma_{m2}$ are significant, the indirect effect is significant, and the analysis proceeds to Step 5; if at least one of the coefficients is not significant, the analysis proceeds to Step 4. Fourth, the bootstrap method is used to test whether the product term $\beta_{m1}\gamma_{m2}$ is significant. If the product term is significant, the indirect effect is significant, and the analysis proceeds to Step 5; if it is not significant, the analysis is terminated at this step. Fifth, the coefficient of urbanization level $\gamma_{m1}$ in Model 3 is tested. If $\gamma_{m1}$ is significant, the direct effect is significant, and the analysis proceeds to the final step. If $\gamma_{m1}$ is not significant, no direct effect exists, indicating that there is only a mediating effect. Lastly, the signs of the product term $\beta_{m1}\gamma_{m2}$ and coefficient $\gamma_{m1}$ are compared: if the signs are the same, there is a complementary mediating effect; if the signs are different, there is a competitive mediating effect. This procedure, proposed by Zhao et al. (2010), improves on the Baron-Kenny procedure and uses the bootstrap test as a supplementary reference. Table 2 presents the descriptive statistics for each variable from 1989 to 2011, covering a total of 1,036 observations.
It can be seen that the average residents' dairy consumption is 171.06 grams per day, and the average urbanization level is around 0.58. The urbanization level in the

Main results
The empirical analysis is conducted mainly in Stata version 14.0. We start with Model 1, which examines the direct effect of urbanization level on residents' dairy consumption. In addition to the time fixed effect, we include province fixed effects to control for provincial differences. The empirical results of Model 1 are presented in Column 1 of Table 3. The results show that the coefficient of the key independent variable, urbanization level ($\alpha_1$), is significantly positive. This indicates that an increase in the urbanization rate is associated with an increase in residents' dairy consumption, but the specific impact mechanism needs to be explored further. We then regress each of the three mediating variables (income level, employment structure and the rise of modern markets) on urbanization level (see Table 4). The coefficients of urbanization level in all three regressions ($\beta_{11}$, $\beta_{21}$, $\beta_{31}$) are significant. This indicates that an increase in the urbanization rate is associated with a significant increase in residents' income level, a decrease in agricultural engagement, and an improvement in the local development of modern markets.
Next, the mediating variables are added to the regression on residents' dairy consumption. Columns 2-4 in Table 3 present the relevant results. Combined with the results in Table 4, the coefficients for income level, $\beta_{11}$ and $\gamma_{12}$, are both significantly positive, indicating that urbanization raises residents' income levels and that the increase in income further promotes the consumption of dairy products. This implies that the "urbanization level-income level-residents' dairy consumption" path is a significant effect path, and the mediating effect equals 3.83 (20.499×0.187). Similarly, the coefficients for the rise of modern markets, $\beta_{31}$ and $\gamma_{32}$, are both significantly positive, indicating that the "urbanization level-rise of modern markets-residents' dairy consumption" path is a significant effect path, and the mediating effect equals 4.39 (1.341×3.272). By contrast, the coefficients for employment structure, $\beta_{21}$ and $\gamma_{22}$, are both significantly negative, indicating that an increase in urbanization level reduces the participation rate in agricultural sectors, and the reduction of manual labor in agricultural sectors, in turn, boosts residents' demand for dairy products. This implies that the "urbanization level-employment structure-residents' dairy consumption" path is also significant, and the mediating effect equals 22.37 ((-30.726)×(-0.728)). We proceed to check the coefficient $\gamma_{m1}$ in Model 3. Table 3 shows that both the value and the significance of coefficient $\gamma_{m1}$ decrease after the mediating variables are added to the model. Specifically, the coefficient $\gamma_{21}$ decreases to a significant negative value, opposite in sign to the indirect effect $\beta_{21}\gamma_{22}$, indicating a competitive mediating effect for the "urbanization level-employment structure-residents' dairy consumption" path. The coefficients $\gamma_{11}$ and $\gamma_{31}$ become insignificant, indicating indirect-only mediation for the "urbanization level-income level-residents' dairy consumption" path and the "urbanization level-rise of modern markets-residents' dairy consumption" path.

Accordingly, the empirical analysis supports all the hypotheses put forward when the full sample is considered.
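The arithmetic behind the three mediating effects, together with the final classification step of the testing procedure, can be sketched as follows. The coefficient pairs are copied from the products quoted in the text, in the order quoted; the text does not state which factor is the Model 2 coefficient and which is the Model 3 coefficient, so the code does not label them. The function names are illustrative, and the significance flags passed in the usage below simply restate what the text reports.

```python
# Coefficient pairs as quoted in the text for each path (order as quoted).
REPORTED_PATHS = {
    "income level": (20.499, 0.187),
    "rise of modern markets": (1.341, 3.272),
    "employment structure": (-30.726, -0.728),
}

def indirect_effect(factors):
    """Indirect (mediating) effect as the product of the two path coefficients."""
    first, second = factors
    return first * second

def mediation_type(indirect, indirect_significant, direct_sign, direct_significant):
    """Classify a path following the last steps of the testing procedure.

    direct_sign is +1 or -1 (ignored when the direct effect is insignificant).
    """
    if not indirect_significant:
        return "no mediation"
    if not direct_significant:
        return "indirect-only"
    same_sign = (indirect > 0) == (direct_sign > 0)
    return "complementary" if same_sign else "competitive"

effects = {name: indirect_effect(f) for name, f in REPORTED_PATHS.items()}
```

For example, `mediation_type(effects["employment structure"], True, -1, True)` returns `"competitive"`, matching the positive indirect effect and the significantly negative direct coefficient reported for that path, while an insignificant direct coefficient yields `"indirect-only"`.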

Heterogeneity analysis
The impact of urbanization on residents' dairy consumption and its effect mechanisms may differ between areas with different urbanization levels and between different geographic regions of China. To further explore the potential heterogeneity, we divide the samples into high-urbanization-level and low-urbanization-level community groups based on the average urbanization level, as well as community groups from eastern and midwestern regions, and then repeat the above process.
For mediation paths that are insignificant in the regression, the bootstrap test is used to verify the mediation effect. The steps of the percentile bootstrap test are as follows [33,34]. First, random sampling with replacement is applied to the original sample to create a new sample. Then, the equations for the dependent variable, with and without mediating variables, are estimated for each bootstrap sample, allowing estimation of $\hat{\beta}_{m1}$, $\hat{\gamma}_{m2}$, and $\hat{\beta}_{m1}\hat{\gamma}_{m2}$. The above process is repeated B times (usually, B = 5000) to obtain B estimates of the mediating effect. These B estimates are then sorted in ascending order to yield sequence C. Finally, the 2.5th and 97.5th percentiles of sequence C are taken as the bounds of the 95% confidence interval of the mediating effect. If the confidence interval does not contain 0, the mediating effect is significant; otherwise, it is not significant. We use SPSS version 18.0 with the PROCESS plug-in version 3.2 written by Andrew F. Hayes to complete the test.
Table 5 presents the results of each mediating path for community groups with different urbanization levels. The results show that in the low-urbanization-level group, the "urbanization level-income level-residents' dairy consumption" path and the "urbanization level-rise of modern markets-residents' dairy consumption" path are the main mediating paths, while in the high-urbanization-level group, the "urbanization level-employment structure-residents' dairy consumption" path is the main mediating path. At the low-urbanization stage, residents' income is relatively low and residents have limited access to various dairy products; therefore, the increase in income and the rise of modern markets represented by supermarkets have a greater impact on the expansion of the sales radius and on the increase in residents' consumption of dairy products. As urbanization advances, a large number of laborers move out of agricultural sectors and their consumption habits change; the role of income levels and modern markets then declines. Therefore, to increase residents' dairy consumption at the low-urbanization stage, more attention should be paid to raising residents' income and ensuring their access to dairy products. Changes in consumption habits should build on ensuring residents' consumption capacity. The total indirect effect ($\sum_{m=1}^{3} \beta_{m1}\gamma_{m2}$) is 16.56 for areas with high urbanization levels and 24.42 for areas with low urbanization levels. This implies that the impact of the urbanization paths on residents' dairy consumption is stronger at the initial stage of urbanization and tends to decline at the margin.
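The percentile bootstrap steps described in this section can be sketched in plain Python. This is a simplified illustration rather than the PROCESS implementation: it uses a single mediator, omits the control variables and fixed effects, and runs on synthetic data generated by the hypothetical `simulate` helper. All function names are assumptions made for the example.

```python
import random

def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def mediator_coef(x, m, y):
    """Coefficient on m in the OLS regression of y on x and m (with intercept)."""
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    smm = sum((b - mm) ** 2 for b in m)
    sxm = sum((a - mx) * (b - mm) for a, b in zip(x, m))
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    smy = sum((b - mm) * (c - my) for b, c in zip(m, y))
    return (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)

def indirect(x, m, y):
    """Indirect effect: (effect of x on m) times (effect of m on y given x)."""
    return slope(x, m) * mediator_coef(x, m, y)

def percentile_bootstrap_ci(x, m, y, b=1000, seed=0):
    """95% percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(b):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        estimates.append(indirect([x[i] for i in idx],
                                  [m[i] for i in idx],
                                  [y[i] for i in idx]))
    estimates.sort()  # the sorted sequence of bootstrap estimates
    lower = estimates[(25 * b) // 1000]       # 2.5th percentile
    upper = estimates[(975 * b) // 1000 - 1]  # 97.5th percentile
    return lower, upper

def simulate(n=300, seed=1):
    """Synthetic data with a true indirect effect of 2.0 * 3.0 = 6.0."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    m = [2.0 * xi + rng.gauss(0, 0.5) for xi in x]
    y = [1.0 * xi + 3.0 * mi + rng.gauss(0, 1.0) for xi, mi in zip(x, m)]
    return x, m, y
```

Calling `percentile_bootstrap_ci(*simulate())` returns the lower and upper bounds; the mediating effect is judged significant when the interval excludes 0. The default of 1,000 replications keeps the example fast, whereas the text uses B = 5000.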
The eastern communities are defined as communities from Beijing, Shanghai, Jiangsu, Shandong, Liaoning, and Heilongjiang provinces, and the midwestern communities are defined as communities from Henan, Hubei, Hunan, Guangxi, Guizhou, and Chongqing provinces. The total indirect effect is 48.80 for the eastern region group, much higher than that for the midwestern region group (24.72). Table 6 presents the results of each mediating path for communities from different regions. The results indicate that the "urbanization level-income level-residents' dairy consumption" path is a significant mediating path for communities in midwestern regions, but not for those in eastern regions. The average urbanization rate of the communities in midwestern China is 0.55, which is lower than that in eastern regions (0.60), so the result for this path is consistent with the result in the low-urbanization-level sample. Moreover, the "urbanization level-employment structure-residents' dairy consumption" mediating path is significant in both regions, but plays a larger mediating role in eastern regions. The effect of the "urbanization level-rise of modern markets-residents' dairy consumption" path does not differ significantly across regions. To increase residents' dairy consumption, raising residents' income remains an effective approach in midwestern China, and encouraging the transfer of agricultural manual labor is an effective approach in any region.

Robustness test
To check the robustness of the models, we also add the three mediating variables to Model 3 simultaneously. The results show that the coefficients of income level and employment structure remain significant, but the coefficient of the rise of modern markets becomes insignificant (Column 5 in Table 3). Therefore, we further use the bootstrap method to test the robustness of the mediating effect through the "urbanization level-rise of modern markets-residents' dairy consumption" path. Table 7 presents the test results of the mediating effect using the percentile bootstrap method. The results show that the bootstrap confidence interval of the total mediating effect does not contain 0, which means that the total mediating effect, considering all three paths, is significant. For the individual paths, the confidence intervals of the "urbanization level-income level-residents' dairy consumption" path and the "urbanization level-employment structure-residents' dairy consumption" path do not contain 0, and thus the mediating effects of these two paths are significant. However, the confidence interval of the "urbanization level-rise of modern markets-residents' dairy consumption" path contains 0, indicating that this mediating effect is not significant in the bootstrap test. Therefore, the significance of the mediation of the rise of modern markets is not robust. Considering that the main dairy product consumed by Chinese residents is ultra-high-temperature processed milk (see Fig 3), which is less dependent on storage conditions, the rise of modern markets may have a limited effect on residents' dairy consumption.
Additionally, the bootstrap test compares the effect sizes of the three mediation paths. The results show that the mediating effect of employment structure is significantly higher than that of income level and the rise of modern markets, while there is no significant difference between income level and the rise of modern markets.

Potential endogeneity
To address potential endogeneity and results that depend on the estimation method, we adopt the instrumental variable method (IV-2SLS) to further investigate the effect of urbanization on residents' dairy consumption. We employ the housing condition score and the sanitation score for each community provided by the CHNS as the instrumental variables. The housing condition score considers electricity availability, the possession of indoor tap water and flush toilets, and natural gas usage. The sanitation score considers the proportion of households with treated water and the prevalence of households without excreta present outside the home in the community. The two indexes are important indicators of urbanization level, but they are unrelated to dairy consumption. Therefore, they can serve as instrumental variables for urbanization level. Table 8 presents the results of Model 1 estimated by the IV-2SLS method. The first-stage estimation results show that both the housing condition score and the sanitation score have a significant positive impact on urbanization level, and the Cragg-Donald Wald F statistic further rejects the hypothesis of weak instrumental variables. The coefficient $\alpha_1$ estimated with the instrumental variables is significantly positive, indicating that an increase in the urbanization rate is associated with a significant increase in residents' dairy consumption. Moreover, the significance and signs of the other model coefficients are basically consistent with the baseline models. Therefore, the core conclusions of this study still hold.
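The two-stage logic of IV-2SLS can be illustrated with a small pure-Python sketch. It is not the Stata estimation used in the study: controls and fixed effects are omitted, the data are synthetic (the `simulate` helper and its coefficients are invented for illustration), and only point estimates are produced, because standard errors taken from a manually run second stage would not be valid.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(columns, y):
    """OLS coefficients (intercept first) of y on the given regressor columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    return solve(xtx, xty)

def two_sls(y, x_endog, instruments):
    """2SLS point estimate of the coefficient on the endogenous regressor."""
    first = ols(instruments, x_endog)  # stage 1: regressor on instruments
    fitted = [first[0] + sum(c * z[i] for c, z in zip(first[1:], instruments))
              for i in range(len(y))]
    return ols([fitted], y)[1]         # stage 2: outcome on fitted values

def simulate(n=4000, seed=0):
    """Synthetic data: true effect 2.0, with an unobserved confounder u."""
    rng = random.Random(seed)
    housing, sanitation, ur, q = [], [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)  # affects both urbanization and consumption
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = 0.5 * z1 + 0.5 * z2 + u + rng.gauss(0, 1)
        housing.append(z1)
        sanitation.append(z2)
        ur.append(x)
        q.append(2.0 * x + 3.0 * u + rng.gauss(0, 1))
    return housing, sanitation, ur, q
```

On such data, `ols([ur], q)[1]` is pulled away from the true coefficient by the confounder, whereas `two_sls(q, ur, [housing, sanitation])` recovers a value close to it, which is the motivation for instrumenting urbanization level.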

Conclusions and discussion
This study uses income growth, employment structure transition, and the rise of modern markets as mediating variables, and data from the 1989-2011 CHNS, to analyze the impact of urbanization on residents' dairy consumption in China. The results indicate that urbanization significantly promotes residents' dairy consumption. Income growth, employment structure transition and the rise of modern markets mediate the effects of urbanization on residents' dairy consumption, and all three mediating variables play a positive role. However, the significance of the rise of modern markets is not robust, which can be explained by the structure of Chinese consumers' dairy consumption. Further, this study explores the heterogeneity across areas with different urbanization levels and across geographic regions. The results imply that the impact of urbanization on residents' dairy consumption is larger in areas with low urbanization levels and in eastern regions than in areas with high urbanization levels and in midwestern regions. Moreover, income growth is the main significant mediator in areas with low urbanization levels and in midwestern regions, employment structure transition is a significant mediator in areas with high urbanization levels and in all regions, and the rise of modern markets is also a significant mediator in areas with low urbanization levels.
The findings of this study have practical implications for understanding the relationship between urbanization and residents' food consumption and for further promoting residents' dairy consumption and the development of China's dairy industry. First, this study shows that urbanization has a positive impact on residents' dairy consumption, which indirectly verifies that urbanization changes residents' food consumption patterns and even their health status. Therefore, to increase residents' consumption of dairy products and improve their health, the Chinese government should continue to promote urbanization. Second, given the impact paths and effect mechanisms of urbanization on residents' dairy consumption, we recommend focusing on improving residents' income level, employment structure, and the development of modern markets to further increase the consumption of dairy products. Third, in areas with low urbanization levels, more emphasis should be placed on increasing residents' disposable income and developing the retail network, to increase residents' consumption capacity and access to various dairy products. In areas with high urbanization levels, the government should focus more on improving the employment structure to change residents' consumption habits, which will better promote residents' consumption of dairy products. Last, encouraging the transfer of agricultural manual laborers, and thereby changing their consumption habits, is an effective way to increase residents' dairy consumption in all regions.
In this study, the historical consumption data for dairy products are not continuous and extend only to 2011, and the distribution of the consumption data is uneven, which may affect the accuracy of the model estimates. In future work, we will consider using datasets with longer continuous time spans and wider geographical coverage to obtain more accurate and precise results.